High Performance Direct Gravitational N-body Simulations on Graphics Processing Units: An implementation in CUDA

Author

  • Jeroen Bédorf
Abstract

At the end of 2006 NVIDIA introduced a new generation of graphics processing units (GPUs), the so-called G80 architecture. These GPUs are more powerful than any GPU released before; in certain situations they deliver up to 350 billion floating-point operations per second (350 GFLOP/s). Together with this hardware NVIDIA released a new programming environment that makes it easier for programmers to use the GPU for tasks other than graphics. This software environment is called Compute Unified Device Architecture (CUDA) and interacts closely with the hardware characteristics of the G80 architecture. In this thesis CUDA is used to implement an N-body algorithm that is suitable for astrophysical direct N-body integrations. The performance of the GPU is compared with that of the GRAPE-6Af special-purpose hardware. To make this comparison we implemented our algorithm in a library that is compatible with the GRAPE library. This library is linked with the starlab software package to compare performance and relative error between the GPU and the GRAPE. Results show that the GPU can match the performance of the GRAPE and achieve a comparable error when a relatively large energy error is acceptable. For high-precision simulations the GRAPE outperforms the GPU, because the double precision of the GRAPE hardware allows the algorithm to take larger time steps while maintaining a higher accuracy. The GPU hardware is single precision, but double-precision GPUs have been announced in various NVIDIA documents. The GPU code can easily be adapted to other N-body algorithms by changing the equations that are used. This enables researchers from other research areas to speed up their codes by several orders of magnitude at a relatively low cost.
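
For readers unfamiliar with the term, the direct N-body force evaluation mentioned in the abstract can be sketched as follows. This is a minimal plain-Python illustration, not the thesis's CUDA kernel or the GRAPE-compatible library interface, neither of which is shown on this page. The function name `direct_accelerations`, the softening length `eps`, and the choice of units with `G = 1` are assumptions made for this example.

```python
import math


def direct_accelerations(pos, mass, eps=1e-3, G=1.0):
    """Return the acceleration on every body by direct summation.

    Hypothetical helper for illustration only. `pos` is a list of
    (x, y, z) tuples, `mass` a list of masses, and `eps` an assumed
    softening length that avoids a singularity at zero separation.
    The cost is O(N^2) pair evaluations.
    """
    n = len(pos)
    acc = []
    for i in range(n):
        xi, yi, zi = pos[i]
        ax = ay = az = 0.0
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            s = G * mass[j] * inv_r3
            ax += s * dx
            ay += s * dy
            az += s * dz
        acc.append([ax, ay, az])
    return acc


# Two unit masses one length unit apart, with softening switched off.
acc = direct_accelerations([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, 1.0], eps=0.0)
print(acc)
```

Each pass of the outer loop depends only on the positions and masses, not on the accelerations of other bodies, so the passes can be computed independently. That independence is what makes a data-parallel device a natural fit for this kind of force summation.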

Similar resources

High Performance Direct Gravitational N-body Simulations on Graphics Processing Units II: An implementation in CUDA

We present the results of gravitational direct N-body simulations using the Graphics Processing Unit (GPU) on a commercial NVIDIA GeForce 8800GTX designed for gaming computers. The force evaluation of the N-body problem is implemented in “Compute Unified Device Architecture” (CUDA) using the GPU to speed up the calculations. We tested the implementation on three different N-body codes: two di...

Parallel Implementation of Particle Swarm Optimization Variants Using Graphics Processing Unit Platform

There are different variants of the Particle Swarm Optimization (PSO) algorithm, such as Adaptive Particle Swarm Optimization (APSO) and Particle Swarm Optimization with an Aging Leader and Challengers (ALC-PSO). These algorithms improve the performance of PSO in terms of finding the best solution and accelerating the convergence speed. However, these algorithms are computationally intensive. The go...

A fully parallel, high precision, N-body code running on hybrid computing platforms

We present a new implementation of the numerical integration of the classical, gravitational N-body problem based on a high-order Hermite integration scheme with block time steps and a direct evaluation of the particle-particle forces. The main innovation of this code (called HiGPUs) is its full parallelization, exploiting both OpenMP and MPI in the use of the multicore Central Processing...

Compute Unified Device Architecture (CUDA) Based Finite-Difference Time-Domain (FDTD) Implementation

Recent developments in the design of graphics processing units (GPUs) have made it possible to use these devices as alternatives to central processor units (CPUs) and perform high performance scientific computing on them. Though several implementations of the finite-difference time-domain (FDTD) method have been reported, the unavailability of high level languages to program graphics cards had been ...

Numerical Simulation of a Lead-Acid Battery Discharge Process using a Developed Framework on Graphic Processing Units

In the present work, a framework is developed for implementation of finite difference schemes on Graphic Processing Units (GPU). The framework is developed using the CUDA language and C++ template meta-programming techniques. The framework is also applicable for other numerical methods which can be represented similar to finite difference schemes such as finite volume methods on structured grid...

Publication date: 2007